Incremental Purposive Behavior Acquisition based on Modular Learning System

Authors

  • Tomoki Nishi
  • Yasutake Takahashi
  • Minoru Asada
Abstract

Applying reinforcement learning methods directly to real robot tasks is difficult because the exploration space is huge and grows exponentially with the number of sensors, and recent robots tend to carry many kinds of sensors. One potential solution is the so-called “mixture of experts” proposed by Jacobs et al.[1]; it decomposes the whole state space into a number of regions so that each expert module can perform well in its assigned small region. This idea is very general and has a wide range of applications; however, we have to consider how to decompose the space into a number of small regions, assign each of them to a learning module or an expert, and define a goal for each of them. To cope with this issue, this paper presents a method of self task decomposition for a modular learning system based on self-interpretation of instructions given by a coach. Unlike conventional approaches, the system decomposes a long-term task into short-term subtasks so that one learning module with limited computational resources can acquire a purposive behavior for one of these subtasks. Since the instructions are given from the viewpoint of a coach who has no idea how the system learns, the learner interprets them to find candidates for subgoals. Finally, the top layer of the hierarchical reinforcement learning system coordinates the lower learning modules to accomplish the whole task. The method is applied to a simple soccer situation in the context of RoboCup.
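The layered idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the one-dimensional corridor world, the `Module` class, the `run_task` function, and the hard-coded subgoal list are all hypothetical stand-ins. In particular, the subgoals are supplied by hand here rather than derived from a coach's instructions, and the top layer is reduced to a fixed sequencer instead of a learned coordinator.

```python
import random


class Module:
    """One lower-layer learner: tabular Q-learning toward a single subgoal."""

    def __init__(self, subgoal, n_states, actions=(-1, 1)):
        self.subgoal = subgoal
        self.n_states = n_states
        self.actions = actions
        self.q = {(s, a): 0.0 for s in range(n_states) for a in actions}

    def move(self, s, a):
        # Toy dynamics (assumption): step left or right, clipped to the corridor.
        return min(max(s + a, 0), self.n_states - 1)

    def greedy(self, s):
        return max(self.actions, key=lambda a: self.q[(s, a)])

    def train(self, start, episodes=300, alpha=0.5, gamma=0.9, seed=0):
        rng = random.Random(seed)
        for _ in range(episodes):
            s = start
            for _ in range(20 * self.n_states):  # cap on episode length
                if s == self.subgoal:
                    break
                # Purely random exploration; Q-learning is off-policy,
                # so the greedy policy can still be learned from it.
                a = rng.choice(self.actions)
                s2 = self.move(s, a)
                if s2 == self.subgoal:
                    target = 1.0  # reward only on reaching the subgoal
                else:
                    target = gamma * max(self.q[(s2, b)] for b in self.actions)
                self.q[(s, a)] += alpha * (target - self.q[(s, a)])
                s = s2


def run_task(start, subgoals, n_states=10):
    """Simplified top layer: train one module per subgoal, then chain them."""
    modules, s0 = [], start
    for g in subgoals:
        m = Module(g, n_states)
        m.train(s0)  # each module only learns its own short-term subtask
        modules.append(m)
        s0 = g
    s, path = start, [start]
    for m in modules:
        for _ in range(n_states):  # safety cap per module
            if s == m.subgoal:
                break
            s = m.move(s, m.greedy(s))
            path.append(s)
    return path


# Hypothetical subgoals standing in for ones found from a coach's instructions.
path = run_task(start=0, subgoals=[3, 7, 9])
```

Each module here holds a small value table for a short-term subtask, which mirrors the abstract's point that one module with limited resources handles one subtask while a higher layer strings the modules together.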

Similar resources

Behavior Acquisition in RoboCup Middle Size League Domain

The RoboCup middle size league is one of the leagues with the longest history in RoboCup. The league has unique features: for example, bigger robots (around 45 cm square) play on the largest field (say, 18 m × 12 m in 2007), no global sensory system is allowed, and all robots have on-board vision systems and controllers. Each robot plays based on its own sensory information, and it ...

A Hybrid Framework for Building an Efficient Incremental Intrusion Detection System

In this paper, a boosting-based incremental hybrid intrusion detection system is introduced. This system combines incremental misuse detection and incremental anomaly detection. We use a boosting ensemble of weak classifiers to implement the misuse intrusion detection system. It can identify new types of intrusions that do not exist in the training dataset for incremental misuse detection. As...

Developmental Approach to Spatial Perception for Imitation Learning: Incremental Demonstrator’s View Recovery by Modular Neural Network

Imitation learning is not only one of the most promising ways to accelerate behavior acquisition for humanoid robots but also one of the most interesting cognitive issues for modeling how we human beings learn to acquire various kinds of behaviors. As a first step towards a developmental approach to spatial perception for imitation learning, this paper proposes a method of incremental recover...

Reasonable performance in less learning time by real robot based on incremental state space segmentation

Reinforcement learning has recently been receiving increased attention as a method for robot learning with little or no a priori knowledge and higher capability of reactive and adaptive behaviors. However, there are two major problems in applying it to real robot tasks: how to construct the state space, and how to reduce the learning time. This paper presents a method by which a robot learns pu...

An Improved Modular Modeling for Analysis of Closed-Cycle Absorption Cooling Systems

A detailed modular model of an absorption cooling system is presented in this paper. The model, including the key components, is described in terms of design parameters, inputs, control variables, and outputs. The model is used to simulate the operating conditions for estimating the behavior of individual components and system performance, and to conduct a sensitivity analysis based on the give...

Journal title:

Volume   Issue

Pages  -

Publication date: 2006